Design, Implementation, and Performance of MPI on Portals 3.0
Authors
Abstract
The emergence of cluster computing as a viable platform for high performance computing has been made possible by significant performance increases in commodity computing and networking hardware. In particular, relatively inexpensive programmable network interface cards (NICs), such as Myrinet (Boden et al., 1995), that are capable of delivering gigabit-per-second speeds have enabled much research into low-level message passing protocols and message passing interfaces (von Eicken et al., 1992, 1995; Ishikawa et al., 1996; Pakin et al., 1997; Myricom, Inc., 1997; Compaq et al., 1997; Prylli and Tourancheau, 1998). Most of this research has focused on delivering latency and bandwidth performance as close as possible to the limits of the hardware. In several respects, the research on clusters of personal computers (PCs) with gigabit networking hardware is addressing many of the same problems that proprietary distributed-memory message passing parallel machines of the early 1990s faced. Despite the differences in hardware architecture between custom-built parallel machines and today’s PC clusters, many of the issues in delivering network performance to parallel applications are similar. The Portals (Brightwell et al., 1999, 2002) data movement interface (Portals 3.0) is an evolution of networking technology initially developed for large-scale, distributed memory, massively parallel systems. Portals began as a key component of our lightweight compute node operating systems (Maccabe et al., 1994; Shuler et al., 1995), and has evolved into a functional interface that can be implemented efficiently for different operating systems and networking hardware. In particular, Portals provides the necessary building blocks for higher-level protocols to be implemented on programmable or intelligent network interfaces without providing mechanisms that are specific to each higher-level protocol.
This paper describes how these building blocks and their associated semantics can be combined to support the protocols needed for a scalable, high performance implementation of the Message Passing Interface (MPI) Standard (MPI Forum, 1994). Portals is the basis for the Computational Plant (Cplant) (Brightwell et al., 2000) cluster at Sandia National Laboratories, and the MPI implementation described in this paper has been used on our large-scale production machines for the last two years.
Similar resources
Design and Implementation of MPI on Portals 3.0
This paper describes an implementation of the Message Passing Interface (MPI) on the Portals 3.0 data movement layer. Portals 3.0 provides low-level building blocks that are flexible enough to support higher-level message passing layers such as MPI very efficiently. Portals 3.0 is also designed to allow for programmable network interface cards to offload message processing from the host process...
On the Current State of Open MPI on Cray Systems
Open MPI provides an implementation of the MPI standard supporting native communication over a range of high-performance network interfaces. Los Alamos National Laboratory (LANL) and Oak Ridge National Laboratory (ORNL) collaborated on creating a port for Cray XE and XK systems. That work has continued and with the release of version 1.8 Open MPI now conforms to MPI-2.2 and MPI-3.0 on Cray XE, ...
An implementation and evaluation of the MPI 3.0 one-sided communication interface
The Message Passing Interface (MPI) 3.0 standard includes a significant revision to MPI’s remote memory access (RMA) interface, which provides support for one-sided communication. MPI-3 RMA is expected to greatly enhance the usability and performance of MPI RMA. We present the first complete implementation of MPI-3 RMA and document implementation techniques and performance optimization opportun...
Porting a Vector Library: a Comparison of MPI, Paris, CMMD and PVM
This paper describes the design and implementation in MPI of the parallel vector library CVL, which is used as the basis for implementing nested data-parallel languages such as NESL and Proteus. We outline the features of CVL, and compare the ease of writing and debugging the portable MPI implementation with our experiences writing previous versions in CM-2 Paris, CM-5 CMMD, and PVM 3.0. We giv...
Portals 3.0: Protocol Building Blocks for Low Overhead Communication
This paper describes the evolution of the Portals message passing architecture and programming interface from its initial development on tightly-coupled massively parallel platforms to the current implementation running on a 1792-node commodity PC Linux cluster. Portals provides the basic building blocks needed for higher-level protocols to implement scalable, low-overhead communication. Portal...
Journal title:
- IJHPCA
Volume 17, Issue
Pages -
Publication date 2003